AMENDMENTS TO THE SPECIFICATION: 

Please amend the paragraph beginning on page 19, line p as follows: 
Figure 3 illustrates an embodiment of the present invention used to set the score threshold 
104 in step 110. In step 301, documents from the training set or subset of the training dataset, 
possibly not overlapping with the subset of the dataset used for term extraction, referred to as the 
thresholding dataset, are scored against the profile vector, and are sorted in descending order 
according to their scores. At each position in the ranked list, a utility value U_i can be computed 
by assuming a threshold that is equal to the score of the document at that position. Therefore, in 
step 302, each position yields a candidate score threshold and a corresponding utility value. 
Thereafter, the "optimal" utility threshold, θ_opt 291 (491), is determined in step 303 as the score 
where the utility is maximum over the thresholding dataset and the "zero" utility threshold, 
θ_zero 292 (492), is determined in step 304 to be the highest score below θ_opt 291 (491) where the 
utility is zero or negative (or the lowest score should the utility fail to reach zero). Using the 
optimal utility threshold and the zero utility threshold, a new profile utility threshold is then 
calculated in step 305 by interpolating between the empirical optimal utility threshold and the 
zero utility threshold over the thresholding dataset using a beta-gamma function 306 as follows: 

threshold = α * θ_zero + (1 - α) * θ_opt 
α = β + (1 - β) 
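
Steps 301 to 305 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment. The linear utility (a credit per relevant document and a penalty per non-relevant one), the function names, and the default values of beta and gamma are hypothetical. The text above gives α only up to the (1 - β) factor; the decay term exp(-γ * M) used here, with M the number of documents in the thresholding dataset, follows commonly published descriptions of beta-gamma thresholding and is likewise an assumption.

```python
import math


def utility_at_each_rank(ranked, credit=2.0, penalty=1.0):
    """Step 302: cumulative utility when the threshold sits at each rank.

    `ranked` is a list of (score, is_relevant) pairs in descending score
    order. The linear credit/penalty utility is an assumed measure.
    """
    utilities, running = [], 0.0
    for _score, is_relevant in ranked:
        running += credit if is_relevant else -penalty
        utilities.append(running)
    return utilities


def beta_gamma_threshold(scored_docs, beta=0.1, gamma=0.05):
    """Return (threshold, theta_opt, theta_zero) for scored documents."""
    if not scored_docs:
        raise ValueError("the thresholding dataset is empty")

    # Step 301: rank the documents by descending score.
    ranked = sorted(scored_docs, key=lambda d: d[0], reverse=True)
    utilities = utility_at_each_rank(ranked)

    # Step 303: theta_opt is the score at the position of maximum utility.
    best = max(range(len(ranked)), key=lambda i: utilities[i])
    theta_opt = ranked[best][0]

    # Step 304: theta_zero is the highest score below theta_opt at which the
    # utility is zero or negative, else the lowest score in the list.
    theta_zero = ranked[-1][0]
    for i in range(best + 1, len(ranked)):
        if utilities[i] <= 0:
            theta_zero = ranked[i][0]
            break

    # Step 305: interpolate. The exp(-gamma * M) term is an assumption.
    m = len(ranked)
    alpha = beta + (1.0 - beta) * math.exp(-gamma * m)
    threshold = alpha * theta_zero + (1.0 - alpha) * theta_opt
    return threshold, theta_opt, theta_zero
```

Under this assumed form, α approaches β as the thresholding dataset grows, so the interpolated threshold moves toward the empirical optimum when more evidence is available and stays closer to the zero-utility threshold when the dataset is small.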

A multiplex filter 591 with three element or constituent filters F_i, 520, 525, and 526, is 
illustrated in Figure 5a. (A multiplex filter is not limited to three constituent filters as given for 
illustration in Figure 5a; rather, it can consist of i such filters, for any i.) This multiplex filter 591, 
made up of constituent filters 520, 525, and 526, accepts or rejects a document 510 (where 
document 510 is represented in terms of its features as defined above) based on some 
interpretation of the independent scoring of each constituent filter F_i. That is, each component 
filter 520, 525, and 526 accepts as input the features and associated values that describe the 
document 510 and scores them against the component filter profiles. The individual filter scores 
570, 575 and 580 are then aggregated to produce an output 585 using a function 595. Various 
aggregation functions 595 can be used for interpreting the scores of a set of filters 570, 575 and 
580, ranging from some simple combination of binary outcomes (e.g., the sum of the "votes" of 
each filter) to a weighted, possibly non-independent scoring based on the interaction of filters. In 
general, classification of a document 510, Doc, using multiplex filters is based upon the 
following procedure where each component filter is assigned a weight Wgt_i (e.g., uniform weight 
or weight proportional to its performance expectation): 
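
One possible reading of such a weighted aggregation is sketched below in Python. It is only an illustration under stated assumptions: the dot-product scoring, the per-filter thresholds, the normalized weighted vote, and every name in the code (`dot_product_filter`, `multiplex_accept`, `vote_threshold`) are hypothetical and are not taken from the specification.

```python
def dot_product_filter(profile):
    """Build a scoring function from a term-weight profile (assumed form)."""
    def score(doc):
        return sum(w * doc.get(term, 0.0) for term, w in profile.items())
    return score


def multiplex_accept(doc, filters, thresholds, weights, vote_threshold=0.5):
    """Accept or reject `doc` from the weighted votes of constituent filters.

    Each filter scores the document independently, casts a binary vote by
    comparing its score with its own threshold, and the votes are combined
    as a weighted average. Returns (accepted, individual_scores).
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    scores = [f(doc) for f in filters]
    votes = [1.0 if s >= t else 0.0 for s, t in zip(scores, thresholds)]
    weighted_vote = sum(w * v for w, v in zip(weights, votes)) / total_weight
    return weighted_vote >= vote_threshold, scores
```

With uniform weights this reduces to a simple majority-style vote; weights proportional to each filter's expected performance let a stronger filter outvote several weaker ones.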

Please amend the paragraph beginning on page , line Yl as follows: 




o Source features are the features directly provided to the system by the pre-processing 
of the document. These correspond to columns 530c. Therefore, column 510c, for 
example, corresponds to the source feature scores Fin for each document. 



o Derived features are features that are computed by filters earlier in the stack of filters. In our 
example, these earlier or lower level filters (models) are 510b, 515b and 520b. Each 
of these features is computed by the filter from which it is derived. That is, each lower 
level filter (e.g., 510b) processes (scores) each example 535b in the training database. 
In Figure 5c, row 515c, for example, represents document DOC ID 1. This can result 
in a binary value (shown as a positive output 540b and a negative output 545b) or an 
actual score (that is, in this case, the document 505b is scored against the filter and 
the similarity score is taken as the actual score) or both. In the example of Figure 5c, for 
explanation purposes, this is limited to the score value. This process results in adding 
a column 520c (corresponding to the result of scoring each document against filter 
510b) to the training dataset where each cell value corresponds to the score between 
each document and the model 515b. Therefore, the columns 540c represent the 
results of scoring the documents against each model M_1 to M_m. 
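
The column-adding step described for derived features can be sketched in Python as follows. This is a minimal sketch under assumptions: the training dataset is represented as a list of dictionaries (one per document), each lower-level model is a plain scoring function, and the names `add_derived_features`, `rows` and `models` are hypothetical.

```python
def add_derived_features(rows, models):
    """Append one derived-feature column per lower-level model.

    `rows` holds the source features of each training document and
    `models` maps a column name to a scoring function. Each model is
    scored on the source features only, and its score is stored as a new
    column, mirroring the score-only case described for Figure 5c.
    """
    augmented = []
    for row in rows:
        new_row = dict(row)  # copy so the source rows are left untouched
        for name, model in models.items():
            new_row[name] = model(row)
        augmented.append(new_row)
    return augmented
```

A higher-level filter can then be trained on the augmented rows, treating the model-score columns exactly like source features.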



Please amend the paragraph beginning on page , line , as follows: 



A variant of a cascade filter is depicted in FIG. 6. Here, each component filter 630, 635 
and 640 accepts as input the source features that describe the document 615, along with derived 
features from the output of earlier filters in the ensemble 670, 671 and 672. Note that elements 
615, 645, 646, 650, 655, 660, 665 and 692 of FIG. 6 have functions corresponding to the 
functions of elements 515, 545, 546, 547, 555, 560, 565 and 592 of FIG. 5d, respectively. Here, 
the output of the previous filter could be the actual score of the document against the filter or a 
classification value (+1/-1) or both. Note that the information added by the processing score or 
other assessment by a filter ordered earlier in a sequence can be regarded as a new feature in the 
feature discrimination space of a subsequent filter. Such new, possibly abstract, features (such as 
the features 540 illustrated in FIG. 5c) can be exploited by subsequent filters in their training and 
in their processing of documents generally. 
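
The feature-passing behavior of this variant can be illustrated with a short Python sketch. It assumes that each filter is a (name, scoring function, threshold) triple and that both the score and a +1/-1 classification value are passed on; these choices and the names `run_feature_passing_ensemble`, `_score` and `_label` are hypothetical, not taken from FIG. 6.

```python
def run_feature_passing_ensemble(doc_features, filters):
    """Apply filters in order, feeding each one the outputs of earlier ones.

    `filters` is a sequence of (name, score_fn, threshold) triples. Each
    filter sees the source features plus the score and the +1/-1 label
    produced by every earlier filter, stored as new derived features.
    Returns (per_filter_outputs, final_feature_dict).
    """
    features = dict(doc_features)  # copy so the caller's dict is not mutated
    outputs = []
    for name, score_fn, threshold in filters:
        score = score_fn(features)
        label = 1 if score >= threshold else -1
        # Expose this filter's result as derived features for later filters.
        features[name + "_score"] = score
        features[name + "_label"] = label
        outputs.append((name, score, label))
    return outputs, features
```

Because later filters read the derived features by name, the order of the filters matters: a filter can only use the outputs of filters placed before it in the sequence.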



940630-010022 

WAI-2232562vl 
940630 - 010001 



Amendment 




The focus of the construction algorithm for cascade filters is on producing a series of 
filters. The training set used for each filter in the series is chosen based on the performance of 
earlier filters in the series. A preferred embodiment for constructing a cascade filter for an 
information need, T, involves a number of steps and assumes as input two subsets of the training 
dataset, D1 704 and D2 702, which are respectively used for feature extraction and threshold 
optimization. The main steps of the algorithm are outlined in block format in Figures 7 to 11. 
The algorithm consists of two threads 700 and 701: the extraction thread 700 and the 
threshold-setting or threshold-optimization thread 701. Each thread results in the construction of its own 
cascade filter, namely, C_Extraction 738 and C_Opt 739. The algorithm is iterative in nature, whereby 
the first filter in the cascade, C1_Extraction 810 710, is constructed using the positive topic examples 
in the extraction dataset D1 804 704. This cascade corresponds to the extraction cascade 
C_Extraction 838. In order to set the threshold for C1_Extraction, a second cascade filter (i.e., the 
optimization cascade) 839 is constructed. The first constituent filter 820 720 in this cascade is a 
copy of C1_Extraction 810 710 and is denoted as C1_Opt 820 720. To avoid clutter in Figure 8, the 
Extraction and Opt suffixes are dropped from the component filter names. The threshold 820 
for C1_Opt 820 720 can be set using any of a number of threshold-setting techniques with respect 
to a specified utility measure over the D2 dataset 802 702. One such method is the beta-gamma 
thresholding algorithm described earlier. The threshold 822 of the C1_Extraction filter 810 710 is set 
to the optimized threshold 820 of C1_Opt 830 720. Subsequently, the fallout, or remainder 
documents, from filter C1 810 710, which pass through the negative class channel 821 (i.e., 
positive examples from D1 that are rejected by C1_Extraction), are used to construct the second filter 
C2_Extraction 930 711 in the cascade, provided various continuation conditions are met. These 
continuation conditions may include one or more of (but are not limited to) the following: the 
number of documents in the fallout, or remainder, of C1_Extraction 822 (not shown), graphically 
depicted in Figure 9 as 922, is greater than a minimum number of documents required to 
construct a filter; the utility of C1_Opt 720, graphically depicted in Figure 9 as 921, over the 
optimization dataset is greater than some threshold (e.g., zero). Next, the threshold 932 for C2_Opt 
721 can be set using a threshold-setting technique as described above with respect to a specified 
utility measure over the D2 dataset 702. The threshold of the C2_Extraction filter 711 is set to the 
optimized threshold 932 of C2_Opt 721. Subsequently, the fallout, or remainder documents from 